POS Multi-tagging Based on Combined Models

Authors

  • Yan Zhao
  • Gertjan van Noord
Abstract

In the POS tagging task, there are two kinds of statistical models: generative models, such as the HMM, and discriminative models, such as the Maximum Entropy Model (MEM). Decoding methods for POS multi-tagging include the N-best paths method and the forward-backward method. In this paper, we use the forward-backward decoding method based on a combined model of HMM and MEM. If $P(t)$ is the forward-backward probability of each possible tag $t$, we first calculate $P(t)$ according to the HMM and the MEM separately. Over all tag options at a given position in a sentence, we normalize $P(t)$ in the HMM and in the MEM separately. The probability of the combined model is the sum of the normalized forward-backward probabilities $P_{\mathrm{norm}}(t)$ of the HMM and the MEM. For each word $w$, we select the tag for which the probability of the combined model is highest. In the experiments, the combined model achieves higher accuracy than either single model on POS tagging tasks in three languages: Chinese, English and Dutch. The results indicate that our combined model is effective.

1. Motivation

Unlike POS single-tagging, POS multi-tagging can assign more than one POS tag to a word in a sentence, according to the ranking of the probability of each tag as calculated by a statistical model. A common use of POS multi-tagging is as a pre-processing step for a parser, to increase accuracy in comparison with single-tagging. Is single-tagging or multi-tagging more suitable for a parser? It depends on the kind of parser. In experiments with PCFG parsing (Charniak and Carroll, 1996) and the RASP parser (Watson, 2006), single-tagging is preferable to multi-tagging, because multi-tagging provides only a minor improvement in accuracy at a significant cost in efficiency.
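The combination rule stated in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes the forward-backward probabilities of each model are already available as one dictionary per sentence position, and the function names (`normalize`, `combine_position`, `best_tags`) and the handling of tags proposed by only one model are hypothetical choices made for this example.

```python
from typing import Dict, List, Sequence


def normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale the tag probabilities at one position so that they sum to 1."""
    total = sum(scores.values())
    if total <= 0.0:
        raise ValueError("tag scores must have a positive sum")
    return {tag: value / total for tag, value in scores.items()}


def combine_position(hmm: Dict[str, float], mem: Dict[str, float]) -> Dict[str, float]:
    """Sum the normalized HMM and MEM probabilities of every candidate tag."""
    hmm_norm = normalize(hmm)
    mem_norm = normalize(mem)
    # Assumption: a tag proposed by only one model contributes 0 from the other.
    tags = sorted(set(hmm_norm) | set(mem_norm))
    return {t: hmm_norm.get(t, 0.0) + mem_norm.get(t, 0.0) for t in tags}


def best_tags(
    hmm_seq: Sequence[Dict[str, float]],
    mem_seq: Sequence[Dict[str, float]],
) -> List[str]:
    """Pick, for each position, the tag with the highest combined score."""
    result = []
    for hmm, mem in zip(hmm_seq, mem_seq):
        combined = combine_position(hmm, mem)
        result.append(max(combined, key=combined.get))
    return result
```

With the made-up inputs `{"NN": 0.06, "VB": 0.02}` for the HMM and `{"NN": 0.2, "VB": 0.3}` for the MEM, the normalized values are 0.75/0.25 and 0.4/0.6, so the combined score favors `NN` even though the MEM alone prefers `VB`. Normalizing first keeps either model from dominating merely because its raw probabilities are on a larger scale.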
By contrast, for parsers based on highly lexicalized grammars, such as the CCG parser and the Alpino parser (Prins and van Noord, 2001), the accuracy of single-tagging is only about 92% to 94% because of the large number of tags (hundreds or thousands), far below the current 97% accuracy in English POS tagging. Multi-tagging has been shown to be necessary for these two parsers. For other languages, such as Chinese, POS tagging is still not good enough because of the relatively small training corpora and differing annotation guidelines, so multi-tagging is also promising for further NLP applications.

POS tagging is one of the best-studied applications in statistical NLP. There are two kinds of statistical models: generative models, such as the HMM (Brants, 2000), and discriminative models, such as the Maximum Entropy (ME) model (Ratnaparkhi, 1996). For the multi-tagging task, Prins and van Noord (2001) used the forward-backward method based on an HMM on a Dutch corpus, and Curran et al. (2006) used the same forward-backward method based on an ME model. In this paper, for the POS multi-tagging task, we test the N-best paths and forward-backward methods on three languages separately, and combine the HMM and ME models based on the forward-backward method.

In the methodology section, we first introduce the HMM and MEM briefly; then we describe the two decoding methods, N-best paths and forward-backward; lastly, we detail how to combine the HMM and ME models within the forward-backward framework. In the experiment section, we compare four kinds of multi-tagging methods based on HMM and MEM. The last section is the conclusion.

2. Methodology

2.1. HMM and MEM

POS tagging may be described as a decoding process over a noisy channel. A sequence of POS tags $T$, which is generated by a source with probability $P(T)$, is transmitted through a noisy channel. The output of the channel is a sequence of words $W$ with conditional probability $P(W \mid T)$.
POS tagging needs to convert the output word sequence $W$ back into the original input tag sequence $T$. This can be accomplished by finding the $\hat{T}$ that maximizes the probability $P(T \mid W)$:

$$\hat{T} = \arg\max_{T} P(T \mid W) \qquad (1)$$

Usually, there is not enough data to estimate $P(T \mid W)$ directly, so Bayes' theorem is applied to swap the order of dependence between the tag sequence $T$ and the word sequence $W$:

$$P(T \mid W) = \frac{P(T, W)}{P(W)} = \frac{P(W \mid T)\,P(T)}{P(W)} \qquad (2)$$

Eliminating the normalizing constant $P(W)$, decoding is equivalent to

$$\hat{T} = \arg\max_{T} P(W \mid T)\,P(T) \qquad (3)$$

$P(W \mid T)$ can be calculated from the state-specific observation probabilities. $P(T)$ can be estimated as the product of transition probabilities, as defined in formula (4):

$$P(T) = P(t_1, \cdots, t_{n-1}) \prod_{i=n}^{N} P(t_i \mid t_{i-n+1}, \cdots, t_{i-1}) \qquad (4)$$
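To make formulas (3) and (4) concrete, the following Python sketch scores tag sequences under a bigram model (n = 2 in formula (4)) and picks the best one by exhaustive search. Everything in it is an assumption made for illustration: the probability tables are hand-picked toy values rather than estimates from any corpus, $P(W \mid T)$ is taken to be a product of per-word observation probabilities (the usual HMM independence assumption), and the names `joint_probability` and `decode` are hypothetical.

```python
from itertools import product
from typing import Sequence, Tuple

# Toy, hand-picked probabilities (illustrative assumptions, not corpus estimates).
INITIAL = {"DT": 0.7, "NN": 0.2, "VB": 0.1}             # P(t_1)
TRANSITION = {("DT", "NN"): 0.8, ("DT", "VB"): 0.1,     # P(t_i | t_{i-1})
              ("NN", "VB"): 0.5, ("NN", "NN"): 0.3}
EMISSION = {("DT", "the"): 0.5,                         # P(w_i | t_i)
            ("NN", "walk"): 0.01, ("VB", "walk"): 0.02}


def joint_probability(words: Sequence[str], tags: Sequence[str]) -> float:
    """Return P(W | T) * P(T) for a non-empty sentence under a bigram model."""
    prob = INITIAL.get(tags[0], 0.0)
    # P(T): product of transition probabilities, formula (4) with n = 2.
    for prev, cur in zip(tags, tags[1:]):
        prob *= TRANSITION.get((prev, cur), 0.0)
    # P(W | T): product of state-specific observation probabilities.
    for tag, word in zip(tags, words):
        prob *= EMISSION.get((tag, word), 0.0)
    return prob


def decode(words: Sequence[str], tagset: Sequence[str]) -> Tuple[str, ...]:
    """Brute-force argmax over all tag sequences, as in formula (3)."""
    candidates = product(tagset, repeat=len(words))
    return max(candidates, key=lambda tags: joint_probability(words, tags))
```

For the two-word input `["the", "walk"]`, the sequence `("DT", "NN")` scores 0.7 × 0.8 × 0.5 × 0.01 = 0.0028 and `("DT", "VB")` scores 0.7 × 0.1 × 0.5 × 0.02 = 0.0007, so `decode` returns `("DT", "NN")`. The exhaustive search is exponential in sentence length and is used here only to mirror the argmax directly; practical taggers replace it with dynamic programming such as the Viterbi algorithm.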


Similar resources

An improved joint model: POS tagging and dependency parsing

Dependency parsing is a form of syntactic parsing that automatically analyzes the dependency structure of natural-language sentences and creates a dependency graph for each input sentence. Part-Of-Speech (POS) tagging is a prerequisite for dependency parsing. Generally, dependency parsers do the POS tagging task along with dependency parsing in a pipeline mode. Unfortunately, in pipel...


A Part-of-Speech Tagging System for the Persian Language

Abstract: Part-Of-Speech (POS) tagging is an essential step for many models and methods in other areas of natural language processing, such as machine translation, spell checking, text-to-speech, automatic speech recognition, etc. So far, highly accurate POS taggers have been created for many languages. In this paper, we focus on POS tagging in the Persian language. Because of problems in Persian POS t...


Part-of-Speech Tagging of the Persian Language Using a Fuzzy Network Model

Part-of-speech tagging (POS tagging) is an ongoing research topic in natural language processing (NLP) applications. The process of classifying words into their parts of speech and labeling them accordingly is known as part-of-speech tagging, POS-tagging, or simply tagging. Parts of speech are also known as word classes or lexical categories. The purpose of POS tagging is determining the grammatical ...


From Word Segmentation to POS Tagging for Vietnamese

This paper presents an empirical comparison of two strategies for Vietnamese Part-of-Speech (POS) tagging from unsegmented text: (i) a pipeline strategy where we consider the output of a word segmenter as the input of a POS tagger, and (ii) a joint strategy where we predict a combined segmentation and POS tag for each syllable. We also make a comparison between state-of-the-art (SOTA) featureba...


A Multi-Neuro Tagger Using Variable Lengths of Contexts

This paper presents a multi-neuro tagger that uses variable lengths of contexts and weighted inputs (with information gains) for part of speech tagging. Computer experiments show that it has a correct rate of over 94% for tagging ambiguous words when a small Thai corpus with 22,311 ambiguous words is used for training. This result is better than any of the results obtained using the single-neur...




Publication date: 2010